ABSTRACT. This paper considers the nonparametric maximum likelihood estimator (MLE)
for the joint distribution function of an interval censored survival time and a continuous
mark variable. We provide a new explicit formula for the MLE in this problem. We use this
formula and the mark specific cumulative hazard function of Huang & Louis (1998) to obtain
the almost sure limit of the MLE. This result leads to necessary and sufficient conditions 
for consistency of the MLE which imply that the MLE is inconsistent in general. We show 
that the inconsistency can be repaired by discretizing the marks. Our theoretical results are 
supported by simulations. 
1. Introduction



Suppose that X is a survival time and Y is a continuous mark variable which may be
correlated with X. Huang & Louis (1998) considered nonparametric estimation of the joint
distribution of X and Y when X is subject to (random) right-censoring and the mark
variable Y is observed if and only if X is uncensored. In many cases of interest, however,
we can only observe an interval censored version of the random variable X. For example,
Hudgens, Maathuis & Gilbert (2007, henceforth HMG) analyzed an HIV vaccine trial in
which X is the time of HIV infection and Y is a measure of the genetic distance between the 
infecting HIV virus and the virus in the vaccine. The participants of this trial were tested for 
HIV at several follow-up times. As a result, X was interval censored, that is, only known to 
be in a time interval determined by the follow-up times. Moreover, since the viral distance 
Y could only be determined for HIV positive individuals, Y was missing for all individuals 
who were HIV negative at their last follow-up visit. 

Motivated by this example we consider the following model, which we refer to as the
"interval censored continuous mark model". Let $X > 0$ be a survival time and let $Y \in \mathbb{R}$ be
a continuous mark variable. For a fixed integer $k \ge 1$, suppose that $T = (T_1, \ldots, T_k)$ is a
vector of observation times with distribution $G$. We assume that $0 < T_1 < \cdots < T_k$ and that
T is independent of (X, Y). We cannot observe (X, Y) directly. Instead, our observed data
are $W = (T, \Delta, Z)$, where

$$\Delta = (\Delta_1, \ldots, \Delta_{k+1}) \quad\text{with}\quad \Delta_j = 1\{T_{j-1} < X \le T_j\}, \quad j = 1, \ldots, k+1,$$

(with the convention that $T_0 = 0$ and $T_{k+1} = \infty$), and

$$Z = \Delta_+ Y \quad\text{with}\quad \Delta_+ = \sum_{j=1}^{k} \Delta_j.$$

Note that the vectors $T$ and $\Delta$ determine a time interval $(T_{j-1}, T_j]$, $j = 1, \ldots, k+1$, that is
known to contain the survival time X. The variable Z reflects that the mark variable Y is
observed if and only if the survival endpoint is reached before the last observation time, i.e.,
if and only if $X \le T_k$.
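
The observation scheme above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `observe` and the representation of $W$ as a tuple of plain lists are hypothetical choices, not part of the paper, and the unobserved mark is coded as 0.0 simply because $Z$ is defined as the product $\Delta_+ Y$.

```python
def observe(x, y, times):
    """Turn a latent pair (x, y) into the observed data W = (T, Delta, Z).

    times: strictly increasing observation times T_1 < ... < T_k, all > 0.
    Returns (times, delta, z), where delta has k + 1 entries and
    z = Delta_+ * y, i.e. the mark if x <= T_k and 0.0 otherwise.
    """
    k = len(times)
    # Conventions T_0 = 0 and T_{k+1} = infinity.
    bounds = [0.0] + list(times) + [float("inf")]
    # Delta_j = 1{T_{j-1} < X <= T_j}, j = 1, ..., k + 1.
    delta = [1 if bounds[j - 1] < x <= bounds[j] else 0 for j in range(1, k + 2)]
    # Delta_+ sums the first k indicators: it is 1 iff X <= T_k.
    delta_plus = sum(delta[:k])
    z = delta_plus * y
    return list(times), delta, z
```

For instance, with observation times 1, 2, 3, a latent time of 1.5 falls in the second interval and its mark is observed, while a latent time of 5 is only known to exceed 3 and its mark is missing.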

Our censoring model for X is called "interval censoring case $k$", since each individual
in the study has exactly $k$ observation times $T_1, \ldots, T_k$ (see Groeneboom & Wellner (1992)
for case 1 and case 2 interval censoring, and Wellner (1995) for case $k$ interval censoring).
Interval censoring case 1 is also referred to as "current status censoring", since we only
observe the "current status" of an individual at a single observation time. A model which
allows the number of observation times to be random, and hence to vary across individuals
in the study, is called "mixed case interval censoring" (see e.g. Schick & Yu (2000),
Van der Vaart & Wellner (2000), and Sun (2006, page 12)).

Our goal here is to study the nonparametric maximum likelihood estimator (MLE) of the
joint distribution $F_0$ of (X, Y) when the observations consist of $W_1, \ldots, W_n$ i.i.d. as $W$. In
particular we focus on consistency issues, and we show, in fact, that the MLE is inconsistent
in general.

There are several known examples of inconsistency of the nonparametric maximum likelihood
estimator. Barlow et al. (1972, pages 255 - 258) showed that the MLE $\hat F_n$ for the class
of star-shaped distributions (distributions on $[0, b)$ with $F(0) = 0$ and $F(x)/x$ non-decreasing)
is inconsistent, by showing that for sampling from the uniform distribution on $[0, 1]$ the MLE
$\hat F_n(x) \to_{a.s.} x^2$. For distributions F with increasing failure rate average (IFRA), Boyles et al.
(1985) showed that the MLE is inconsistent, and they identified the limit explicitly for
sampling from a general continuous distribution function F. In the context of bivariate
right-censored data, inconsistency of the nonparametric MLE for continuous bivariate distributions
was pointed out by Tsai et al. (1986) and was also studied by Van der Laan (1996). For
estimation of a distribution function on $\mathbb{R}$ based on left-truncated and case 1 interval censored
data, Pan & Chappell (1999) showed that the nonparametric MLE is inconsistent. Finally,
Maathuis (2003, Section 6.2) showed inconsistency of the MLE of the bivariate distribution
of (X, Y) when X is subject to current status censoring and Y is observed exactly.

There are many more examples of inconsistent maximum likelihood estimators in parametric
problems: see, for example, Neyman & Scott (1948), Bahadur (1958), Ferguson (1982),
Ghosh & Yang (1995), Gupta et al. (1999), and the interesting review by Le Cam (1990).

To relate our inconsistency result to some of these earlier studies of inconsistency of the
MLE, note that observation of $W$ instead of (X, Y) can be regarded as observation of a
(random) set $A$ known to contain the unobservable (X, Y). We call such a set an observed
set. In our model the observed sets can take two forms. When $\Delta_j = 1$ for some $j \le k$ (so
$\Delta_+ = 1$), then the observed set is a horizontal line segment:

$$A = (T_{j-1}, T_j] \times \{Z\}, \tag{1}$$

while when $\Delta_{k+1} = 1$, or equivalently, when $\Delta_+ = 0$, the observed set is a half plane:

$$A = (T_k, \infty) \times \mathbb{R}. \tag{2}$$

The line segments that arise when $\Delta_+ = 1$ are an indicator of potential consistency problems
for the MLE, since such line segments also occurred in the inconsistent MLEs studied by
Van der Laan (1996) and Maathuis (2003, Section 6.2). This prompted us to carefully study
consistency of the MLE for interval censored continuous mark data.

Our work is also related to the classical competing risks model, in which one studies the
failure time X of a system that can fail from a finite number J of competing risks given
by values of $Y \in \{1, \ldots, J\}$. The variable Y in this model can only be observed after the
failure event happened, and is therefore a mark variable. Thus, the classical competing risks
model can be called a "discrete mark model", and can be viewed as the discrete counterpart
of the continuous mark model. The competing risks model has been studied under
various censoring assumptions for X. Aalen (1976, 1978) and Kalbfleisch & Prentice (1980,
§7.2, pages 163 - 178) studied the MLE in this model when X is subject to right censoring.
The generalization to interval censored survival data with competing risks was considered
by Hudgens, Satten & Longini (2001) and Jewell, Van der Laan & Henneman (2003).
Jewell & Kalbfleisch (2004) studied computational issues of the MLE for current status data
with competing risks, and Maathuis (2006), Groeneboom, Maathuis & Wellner (2006a), and
Groeneboom, Maathuis & Wellner (2006b) derived the asymptotic properties of the MLE in
this model.

In the current paper we focus on the interval censored continuous mark model. In Section 2
we derive a new formula for the MLE in this model, using connections with univariate right
censored data. In Section 3 we use this new formula and the mark specific cumulative hazard
function of Huang & Louis (1998) to derive the almost sure limit of the MLE. This result
leads to necessary and sufficient conditions for consistency of the MLE which force a relation
between the unknown distribution $F_0$ and the observation time distribution G. Since such a
relation will typically not hold, it follows that the MLE is inconsistent in general. In Section 4
we show that the inconsistency can be repaired by discretizing the marks, an operation that
transforms the data into interval censored competing risks data. In Section 5 we support our
theoretical results by simulations of the MLE and the repaired MLE. Section 6 contains a
discussion of some remaining issues. Technical proofs are collected in the Appendix.



2. Explicit formula for the MLE 

HMG noted a close connection between the MLE for univariate right censored data and the
MLE for interval censored continuous mark data. We use this connection in Section 2.2
to derive a new explicit formula for the MLE for interval censored continuous mark data.
But first, in Section 2.1 we review univariate right censored data in a way that shows the
similarity between the two models.



2.1. Intermezzo: univariate right censored data 

Suppose that we want to estimate the distribution $F_0$ of a survival time X, and suppose
that X is subject to right censoring. Thus, instead of n i.i.d. copies of X, we observe n i.i.d.
copies of $(\min(X, T), 1\{X \le T\})$, where T is a random censoring time with distribution G.
We assume that T is independent of X. It is well-known that the MLE $\hat F_n$ of $F_0$ in this model
is given by the Kaplan-Meier estimator.

We now review the Kaplan-Meier estimator in a way that allows us to easily make a
connection with interval censored continuous mark data. We first introduce some notation
and terminology. Define $U = \min(X, T)$ and $\Delta = 1\{X \le T\}$, and let $(U_1, \Delta_1), \ldots, (U_n, \Delta_n)$
denote n i.i.d. copies of $(U, \Delta)$. Recalling the discussion of observed sets in Section 1, each
observation $(U, \Delta)$ defines an observed set $A$ that is known to contain X: $A = \{U\}$ if $\Delta = 1$,
and $A = (U, \infty)$ if $\Delta = 0$. Let $U_{(1)}, \ldots, U_{(n)}$ be the order statistics of $U_1, \ldots, U_n$, and let
$\Delta_{(i)}$ and $A_{(i)}$ be the corresponding values of $\Delta$ and $A$. We assume that all $U_i$ with $\Delta_i = 1$
are distinct, since this will be the case for the continuous mark data. However, we allow ties
in the T's and U's provided that this assumption is not violated. We break such ties in U
arbitrarily after ensuring that observations with $\Delta = 1$ are ordered before those with $\Delta = 0$.

By assuming that F has a density f with respect to some dominating measure $\mu$, the
likelihood (up to multiplicative terms depending only on G) is $L_n(F) = \prod_{i=1}^n q(U_i, \Delta_i)$,
where $q(u, \delta) = f(u)^{\delta} \{1 - F(u)\}^{1-\delta}$. Since the first term of q is a density-type term, $L_n(F)$
can be made arbitrarily large by letting f peak at some value $U_i$ with $\Delta_i = 1$. This problem
is usually solved by maximizing $L_n(F)$ over the class of distribution functions that have a
density with respect to counting measure on the observed failure times. We can then write
$L_n(F) = \prod_{i=1}^n P_F(A_i)$, where $P_F(A)$ is the probability of $A$ under $F$.

It is well-known (Peto, 1973; Turnbull, 1976) that the MLE in censored data problems
can only assign mass to a finite number of disjoint regions, called maximal intersections
by Wong & Yu (1999). Maathuis (2005) introduced an efficient algorithm to compute the
maximal intersections for d-variate interval censored data. This algorithm is based on a
height map $h : \mathbb{R}^d \to \mathbb{N}$ of the observed sets, where $h(x)$ is defined as the number of observed
sets that contain x. Maathuis showed that the maximal intersections correspond exactly to
the local maximum regions of the height map of the observed sets. (If there are ties in the
observed sets, then these need to be resolved before applying the height map, see
Maathuis (2005).)

The height map $h : \mathbb{R} \to \mathbb{N}$ for univariate right censored data is illustrated in Figure 1.
Note that $h(x)$ simply represents the number of observed sets $A_1, \ldots, A_n$ that overlap at the
point x. It is clear that all sets $A_{(i)}$ with $i \in \mathcal{I} = \{i \in \{1, \ldots, n\} : \Delta_{(i)} = 1\}$, or in other
words, all sets of the form $A_{(i)} = \{U_{(i)}\}$, are local maxima of the height map. Hence, all such
sets are maximal intersections, and we denote these by $M_{(i)}$, $i \in \mathcal{I}$. This notation may seem
redundant since $M_{(i)} = A_{(i)}$, but it will be useful in Section 2.2. Furthermore, if and only if
$\Delta_{(n)} = 0$, the height map has an extra local maximum region $A_{(n)} = (U_{(n)}, \infty)$, resulting in
an extra maximal intersection $M_{(n+1)} = (U_{(n)}, \infty)$. This situation is illustrated in Figure 1.
Let $\bar{\mathcal{I}}$ be the collection of indices of all maximal intersections. Thus, $\bar{\mathcal{I}} = \mathcal{I}$ if $\Delta_{(n)} = 1$ and
$\bar{\mathcal{I}} = \mathcal{I} \cup \{n+1\}$ if $\Delta_{(n)} = 0$.

[Figure 1 about here.]

Let $p_i$ be the probability mass of maximal intersection $M_{(i)}$, $i \in \bar{\mathcal{I}}$. We can then write
the likelihood in terms of the $p_i$'s:

$$\prod_{i=1}^n P_F(A_{(i)}) = \prod_{i=1}^n \Big( \sum_{j \in \bar{\mathcal{I}}} p_j 1\{M_{(j)} \subseteq A_{(i)}\} \Big) = \prod_{i=1}^n p_i^{\Delta_{(i)}} \Big( \sum_{j \ge i+1,\, j \in \bar{\mathcal{I}}} p_j \Big)^{1 - \Delta_{(i)}}, \tag{3}$$

where the second equality follows from the fact that the data are ordered with respect to the
variable $U = \min(X, T)$. The MLE $\hat p$ maximizes this expression under the constraints

$$\sum_{i \in \bar{\mathcal{I}}} p_i = 1 \quad\text{and}\quad p_i \ge 0 \ \text{ for all } i \in \bar{\mathcal{I}}. \tag{4}$$

It is well-known that $\hat p$ is the Kaplan-Meier or product-limit estimator, given by

$$\hat p_i = \prod_{j=1}^{i-1} \Big( 1 - \frac{\Delta_{(j)}}{n - j + 1} \Big) \frac{\Delta_{(i)}}{n - i + 1}, \quad i \in \mathcal{I},$$

and $\hat p_{n+1} = 1 - \sum_{i \in \mathcal{I}} \hat p_i$ if $\Delta_{(n)} = 0$ (see for example Shorack & Wellner (1986), Chapter 7,
pages 332-333). Equivalently, we can write

$$\sum_{j \ge i,\, j \in \bar{\mathcal{I}}} \hat p_j = \prod_{j \le i-1} \Big( 1 - \frac{\Delta_{(j)}}{n - j + 1} \Big), \quad i \in \bar{\mathcal{I}}.$$

The vector $\hat p$ is uniquely determined. We obtain $\hat F_n(x)$ by summing all probability mass of
$\hat p$ that falls in the interval $(0, x]$. It is well-known that $\hat F_n(x)$ is non-unique for $x > U_{(n)}$
if and only if $\Delta_{(n)} = 0$. This is caused by the fact that the MLE is indifferent to the
distribution of mass within a maximal intersection, called "representational non-uniqueness"
by Gentleman & Vandal (2002). Since all maximal intersections $\{M_{(i)} : i \in \mathcal{I}\}$ are points,
this non-uniqueness occurs if and only if $M_{(n+1)} = (U_{(n)}, \infty)$ exists, and this happens if and
only if $\Delta_{(n)} = 0$.
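
The product-limit masses reviewed in this subsection can be computed in one pass over the sorted data. The following is a minimal sketch, assuming the data arrive as two parallel lists of censored times and censoring indicators; the function name `kaplan_meier_masses` and the returned pair are illustrative choices, not notation from the paper.

```python
def kaplan_meier_masses(u, delta):
    """Product-limit masses for right censored data.

    u: observed times U_i = min(X_i, T_i); delta: indicators 1{X_i <= T_i}.
    Returns (p, p_extra): p[i-1] is the mass on the i-th ordered observation
    (0.0 when it is censored) and p_extra is the mass on (U_(n), infinity).
    """
    # Sort by U; within ties put uncensored observations (delta = 1) first.
    pairs = sorted(zip(u, delta), key=lambda t: (t[0], -t[1]))
    n = len(pairs)
    surv = 1.0  # running product of (1 - Delta_(j) / (n - j + 1)) over j < i
    p = []
    for i, (_, d) in enumerate(pairs, start=1):
        p.append(surv * d / (n - i + 1))
        surv *= 1.0 - d / (n - i + 1)
    # The leftover mass is nonzero only if the largest observation is censored.
    return p, surv
```

With four observations at times 1, 2, 3, 4 and indicators 1, 0, 1, 0, the masses are 1/4 and 3/8 on the two uncensored points, and the remaining 3/8 is assigned to the interval beyond the largest (censored) observation.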



2.2. Continuous mark data: Explicit formula for the MLE 

We now return to the interval censored continuous mark model given in Section 1 and
introduce some additional notation. Let $F_0(x, y) = P(X \le x, Y \le y)$ be the joint distribution
of (X, Y), and let $F_{0X}(x) = F_0(x, \infty) = P(X \le x)$ and $F_{0Y}(y) = F_0(\infty, y) = P(Y \le y)$ be
the marginal distributions of X and Y, respectively. Recall that G denotes the distribution
of the observation times T. We use subscripts to denote the marginal distributions of G.
For example, $G_1$ is the distribution of $T_1$ and $G_{2,3}$ is the distribution of $(T_2, T_3)$. For current
status censoring (k = 1), we denote the observation time simply by T.

We study the MLE $\hat F_n$ of $F_0$, based on n i.i.d. copies $W_1, \ldots, W_n$ of $W$, where $W_i =
(T_i, \Delta_i, Z_i)$, $T_i = (T_{1,i}, \ldots, T_{k,i})$ and $\Delta_i = (\Delta_{1,i}, \ldots, \Delta_{k+1,i})$. We allow ties between the
observation times of $T_i$ and $T_j$ for $i \ne j$.

The observed sets $A$ in this model are given in equations (1) and (2). Recall that $A$ is a
line segment if $\Delta_+ = 1$ and that $A$ is a half plane if $\Delta_+ = 0$. Assuming that F has a density
f with respect to some dominating measure $\mu_X \times \mu_Y$, the likelihood (up to multiplicative
terms only depending on G) is given by $L_n(F) = \prod_{i=1}^n q(W_i)$, where

$$q(w) = q(t, \delta, z) = \prod_{j=1}^{k} \Big\{ \int_{(t_{j-1}, t_j]} f(s, z)\, \mu_X(ds) \Big\}^{\delta_j} \{1 - F_X(t_k)\}^{1 - \delta_+},$$

and $F_X(x) = F(x, \infty)$ is the marginal distribution of X under F. Since the first term of
q is a density-type term, $L_n(F)$ can be made arbitrarily large by letting $f(s, z)$ peak at
$z = Z_i$ for some observation with $\Delta_{+,i} = 1$. We therefore define the MLE $\hat F_n(x, y)$ to be
the maximizer of $L_n(F)$ over the class $\mathcal{F}$ of all bivariate distribution functions that have a
marginal density $f_Y$ with respect to counting measure on the observed marks. We can then
write $L_n(F) = \prod_{i=1}^n P_F(A_i)$.

Analogously to Maathuis (2005), we call the projection of $A$ on the x-axis the x-interval
of $A$. We denote the left endpoint and right endpoint of the x-interval by L and R:

$$L = \sum_{j=1}^{k+1} \Delta_j T_{j-1} \quad\text{and}\quad R = \sum_{j=1}^{k+1} \Delta_j T_j.$$
Furthermore, we define a new variable U that will play an important role in our analysis:

$$U = \Delta_+ R + \Delta_{k+1} L. \tag{5}$$

Let $U_{(1)}, \ldots, U_{(n)}$ be the order statistics of $U_1, \ldots, U_n$ and let $\Delta_{(i)} = (\Delta_{1(i)}, \ldots, \Delta_{k+1(i)})$,
$Z_{(i)}$, $A_{(i)}$, $L_{(i)}$ and $R_{(i)}$ be the corresponding values of $\Delta$, Z, A, L and R. We break ties
in U arbitrarily after ensuring that observations with $\Delta_+ = 1$ are ordered before those with
$\Delta_+ = 0$. Recall that the maximal intersections are the local maximum regions of the height
map $h : \mathbb{R}^2 \to \mathbb{N}$ of the observed sets. Since Y is continuous, the observed sets $A_{(i)}$ with
$i \in \mathcal{I} = \{i \in \{1, \ldots, n\} : \Delta_{+(i)} = 1\}$ are completely distinct with probability one. Hence,
each such $A_{(i)}$ contains exactly one maximal intersection $M_{(i)}$ of the form:

$$M_{(i)} = (D_{(i)}, R_{(i)}] \times \{Z_{(i)}\}, \quad\text{where}\quad D_{(i)} = \max\big( \{L_{(j)} : j \notin \mathcal{I},\ j < i\} \cup \{L_{(i)}\} \big). \tag{6}$$

To understand this expression, let $S_{(i)}$ be the collection of observed sets $A_{(j)}$ with $\Delta_{+(j)} = 0$
and $L_{(i)} < L_{(j)} < R_{(i)}$. If $S_{(i)} = \emptyset$, then the height map is constant on $A_{(i)}$, and the complete
set $A_{(i)}$ is a local maximum region. Hence, in this case $M_{(i)} = A_{(i)}$ and $D_{(i)} = L_{(i)}$. On
the other hand, if $S_{(i)} \ne \emptyset$, then the height map is increasing on $A_{(i)}$ in the x-direction.
Hence, in this case $M_{(i)} \subset A_{(i)}$ and the left endpoint of $M_{(i)}$ is $\max\{L_{(j)} : A_{(j)} \in S_{(i)}\}$, which
equals $\max\{L_{(j)} : j \notin \mathcal{I},\ j < i\}$. Note that the right endpoints of $M_{(i)}$ and $A_{(i)}$ are always
identical. Moreover, note that the equations in (6) imply that the maximal intersections can
be computed in $O(n \log n)$ time, since the most computationally intensive step consists of
sorting the data. This is faster than the height map algorithm of Maathuis (2005), due to
the special structure in the data.
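
The sort-then-scan structure behind the $O(n \log n)$ claim can be sketched as follows. This is an illustration rather than the authors' implementation: the function name `maximal_intersections` and the input format, a list of `(times, delta, z)` triples with `delta` of length $k + 1$, are assumptions made for the example. The scan keeps a running maximum of the left endpoints of the half planes seen so far, which is exactly the maximum appearing in the definition of $D_{(i)}$.

```python
def maximal_intersections(data):
    """Left endpoint, right endpoint and mark of each maximal line segment.

    data: list of (times, delta, z); delta has k + 1 entries, exactly one is 1.
    Returns a list of (D, R, Z) triples, one per observation with Delta_+ = 1,
    ordered by U; each triple stands for the segment (D, R] x {Z}.
    """
    rows = []
    for times, delta, z in data:
        bounds = [0.0] + list(times) + [float("inf")]   # T_0 = 0, T_{k+1} = inf
        j = delta.index(1) + 1                          # the j with Delta_j = 1
        left, right = bounds[j - 1], bounds[j]          # x-interval (L, R]
        delta_plus = 1 if j <= len(times) else 0
        u = right if delta_plus else left               # U = Delta_+ R + Delta_{k+1} L
        rows.append((u, delta_plus, left, right, z))
    # Sorting dominates the cost; within ties, Delta_+ = 1 comes first.
    rows.sort(key=lambda r: (r[0], -r[1]))
    out = []
    max_censored_left = float("-inf")  # max of L_(j) over earlier j with Delta_+ = 0
    for u, delta_plus, left, right, z in rows:
        if delta_plus:
            out.append((max(max_censored_left, left), right, z))
        else:
            max_censored_left = max(max_censored_left, left)
    return out
```

In a small current status example with a half plane starting at 1 and a line segment $(0, 2] \times \{0.3\}$, the scan shortens the segment to $(1, 2] \times \{0.3\}$, while a segment $(0, 0.5] \times \{0.9\}$ that ends before any half plane starts is left unchanged.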

Analogously to the situation for univariate right censored data, there is an extra maximal
intersection $M_{(n+1)} = A_{(n)} = (U_{(n)}, \infty) \times \mathbb{R}$ if and only if $\Delta_{+(n)} = 0$. Let $\bar{\mathcal{I}}$ be the collection
of indices of all maximal intersections. Thus, $\bar{\mathcal{I}} = \mathcal{I}$ if $\Delta_{+(n)} = 1$ and $\bar{\mathcal{I}} = \mathcal{I} \cup \{n+1\}$ if
$\Delta_{+(n)} = 0$. Let $p_i$ be the probability mass of maximal intersection $M_{(i)}$, $i \in \bar{\mathcal{I}}$. Then the
likelihood can be written as

$$\prod_{i=1}^n P_F(A_{(i)}) = \prod_{i=1}^n \Big( \sum_{j \in \bar{\mathcal{I}}} p_j 1\{M_{(j)} \subseteq A_{(i)}\} \Big) = \prod_{i=1}^n p_i^{\Delta_{+(i)}} \Big( \sum_{j \ge i+1,\, j \in \bar{\mathcal{I}}} p_j \Big)^{1 - \Delta_{+(i)}}, \tag{7}$$

where the second equality follows from the fact that the data are ordered with respect to
the variable U which was defined in (5). The MLE $\hat p$ maximizes this expression under the
constraints (4). From the analogy with the likelihood (3) it follows immediately that

$$\hat p_i = \prod_{j=1}^{i-1} \Big( 1 - \frac{\Delta_{+(j)}}{n - j + 1} \Big) \frac{\Delta_{+(i)}}{n - i + 1}, \quad i \in \mathcal{I}, \tag{8}$$

and $\hat p_{n+1} = 1 - \sum_{i \in \mathcal{I}} \hat p_i$ if $\Delta_{+(n)} = 0$. Equivalently, we can write

$$\sum_{j \ge i,\, j \in \bar{\mathcal{I}}} \hat p_j = \prod_{j \le i-1} \Big( 1 - \frac{\Delta_{+(j)}}{n - j + 1} \Big), \quad i \in \bar{\mathcal{I}}. \tag{9}$$

These formulas are different from (but equivalent to) the ones given in Section 3.1 of HMG.
The form given here has several advantages. First, the tail probabilities (9) can be computed
in time complexity $O(n \log n)$, since sorting the data is the most computationally intensive
step. Furthermore, the current form provides additional insights about the behavior of the
MLE. In particular, it shows that the MLE can be viewed as a right endpoint imputation
estimator (see Remark 1), and it allows for a derivation of the almost sure limit of the MLE
(see Section 3).

The vector $\hat p$ is uniquely determined. This was noted by HMG and also follows from
our derivation here. We obtain $\hat F_n(x, y)$ by summing all probability mass of $\hat p$ that falls in
the region $(0, x] \times (-\infty, y]$. We define a marginal MLE for the distribution of X by letting
$\hat F_{Xn}(x) = \hat F_n(x, \infty)$. The estimators $\hat F_n$ and $\hat F_{Xn}$ can suffer considerably from
representational non-uniqueness, since the maximal intersections $\{M_{(i)} : i \in \mathcal{I}\}$ are line segments, and
the potential maximal intersection $M_{(n+1)}$ is a half plane. We let $\hat F_n^l$ denote the estimator
that assigns all mass to the upper right corners of the maximal intersections, since it is a
lower bound for the MLE. Similarly, we let $\hat F_n^u$ denote the estimator that assigns all mass to
the lower left corners of the maximal intersections, since it is an upper bound for the MLE.
The formulas for $\hat F_{Xn}^l$ and $\hat F_n^l$ can be written as follows:

$$1 - \hat F_{Xn}^l(x) = \prod_{U_{(i)} \le x} \Big( 1 - \frac{\Delta_{+(i)}}{n - i + 1} \Big), \tag{10}$$

$$\hat F_n^l(x, y) = \sum_{i=1}^n \hat p_i 1\{U_{(i)} \le x, Z_{(i)} \le y\} = \sum_{U_{(i)} \le x} \prod_{U_{(j)} < U_{(i)}} \Big( 1 - \frac{\Delta_{+(j)}}{n - j + 1} \Big) \frac{\Delta_{+(i)} 1\{Z_{(i)} \le y\}}{n - i + 1}, \tag{11}$$

using (8), (9) and the definition of U in (5).
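
A minimal sketch of the lower-bound estimator follows, assuming the per-observation values of $U$, $\Delta_+$ and $Z$ have already been computed and are passed as three parallel lists. The function name `lower_mle` and the closure-based interface are hypothetical. The masses follow the product-limit form with the product taken over earlier positions in the sorted order, and each mass is placed at the point $(U_{(i)}, Z_{(i)})$, the upper right corner of the corresponding line segment.

```python
def lower_mle(u, delta_plus, z):
    """Return a function F(x, y) that evaluates the lower-bound estimator.

    u, delta_plus, z: per-observation values of U, Delta_+ and Z (any order).
    """
    # Sort by U; within ties, observations with Delta_+ = 1 come first.
    rows = sorted(zip(u, delta_plus, z), key=lambda r: (r[0], -r[1]))
    n = len(rows)
    atoms = []   # (U_(i), Z_(i), p_i) for the observations with Delta_+ = 1
    surv = 1.0   # running product of (1 - Delta_+(j) / (n - j + 1)) over j < i
    for i, (ui, di, zi) in enumerate(rows, start=1):
        if di:
            atoms.append((ui, zi, surv / (n - i + 1)))
            surv *= 1.0 - 1.0 / (n - i + 1)

    def F(x, y):
        # Sum the masses whose atoms lie in (0, x] x (-inf, y].
        return sum(p for (ui, zi, p) in atoms if ui <= x and zi <= y)

    return F
```

For three observations with $U$ values 0.5, 1, 2, indicators 1, 0, 1 and marks 0.9, 0, 0.3, the estimator places mass 1/3 at (0.5, 0.9) and mass 2/3 at (2, 0.3).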



Remark 1. The MLE $\hat F_n^l$ can be viewed as a right endpoint imputation estimator. To see
this, consider creating a new collection of observed sets $A'_{(i)}$:

$$A'_{(i)} = \begin{cases} \{U_{(i)}\} \times \{Z_{(i)}\} & \text{if } i \in \mathcal{I}, \\ A_{(i)} & \text{if } i \notin \mathcal{I}. \end{cases}$$

That is, for each $i = 1, \ldots, n$, we replace $A_{(i)}$ by its right endpoint if $\Delta_{+(i)} = 1$, while we
leave it unchanged if $\Delta_{+(i)} = 0$. The intersection structures of $\{A_{(i)}\}_{i=1}^n$ and $\{A'_{(i)}\}_{i=1}^n$ are
identical, meaning that $A_{(i)} \cap A_{(j)} = \emptyset$ if and only if $A'_{(i)} \cap A'_{(j)} = \emptyset$, for all $i, j \in \{1, \ldots, n\}$.
Furthermore, the maximal intersections of $\{A'_{(i)}\}_{i=1}^n$ are $\{M'_{(i)} = A'_{(i)} : i \in \mathcal{I}\}$. Hence, writing
the likelihood for the imputed data in terms of $p$ yields exactly the same likelihood as (7).
This implies that the maximizing vector $\hat p'$ is identical to the vector $\hat p$ for the original data.
Moreover, the upper right corners of $\{M_{(i)}\}$, $i \in \bar{\mathcal{I}}$, and $\{M'_{(i)}\}$, $i \in \bar{\mathcal{I}}$, are identical. Since $\hat F_n^l$
assigns all mass to the upper right corners of the maximal intersections, it follows that $\hat F_n^l$ is
completely equivalent to the MLE for the modified data. Finally, note that the right endpoint
imputation scheme imputes an x-value that is always at least as large as the unobserved X.
This explains why the MLE $\hat F_{Xn}^l$ tends to have a negative bias.



3. Inconsistency of the MLE 

In this section we derive necessary and sufficient conditions for consistency of the MLEs $\hat F_{Xn}^l$
and $\hat F_n^l$ (Theorem 1). These conditions force a relation between the unknown distribution
$F_0$ and the observation time distribution G. Since such a relation will typically not hold, it
follows that $\hat F_n^l$ is inconsistent in general. Corollary 1 further strengthens this result when
X is subject to current status censoring, and shows that in that case $\hat F_{Xn}^l$ is inconsistent for
any continuous choice of $F_0$ and G. Corollary 2 shows that the asymptotic biases of $\hat F_{Xn}^l$ and
$\hat F_n^l$ converge to zero as the number k of observation times per subject increases, at least for
one particular distribution of $T_1, \ldots, T_k$.

The results in this section are based on deriving the limits $F_{X\infty}^l$ and $F_\infty^l$ for the lower
bounds $\hat F_{Xn}^l$ and $\hat F_n^l$ of the MLE. The reason for looking at these lower bounds is that $\hat F_{Xn}^l$
and $\hat F_n^l$ can be expressed in simple closed forms (see (10) and (11)). Moreover, in many cases
representational non-uniqueness disappears in the limit, so that the limits of $\hat F_{Xn}$ and $\hat F_n$ are
unique and equal to $F_{X\infty}^l$ and $F_\infty^l$. Necessary and sufficient conditions for uniqueness of the
limit are: (i) all maximal intersections $M_{(i)}$, $i \in \mathcal{I}$, converge to points, and (ii) $\sum_{i \in \mathcal{I}} \hat p_i \to 1$ as
$n \to \infty$. These conditions are satisfied in Examples 1 and 2 in Section 5. If these conditions
fail, then the upper bounds $F_{X\infty}^u$ and $F_\infty^u$ can be obtained from their lower bounds by
reassigning mass from the upper right corners of the maximal intersections to the lower left
corners. This occurs in Examples 3 and 4 in Section 5, and further details can be found in
Maathuis (2006, Section 9.4).

In order to derive $F_{X\infty}^l$ and $F_\infty^l$ we start by rewriting equations (10) and (11) in terms
of stochastic processes. We introduce the following notation:

$$\begin{aligned}
\mathbb{H}_n(x) &= \mathbb{P}_n 1\{U \le x\}, & x \ge 0, \\
\mathbb{V}_n(x, y) &= \mathbb{P}_n \Delta_+ 1\{U \le x, Z \le y\}, & x \ge 0,\ y \in \mathbb{R}, \\
\mathbb{V}_{Xn}(x) &= \mathbb{V}_n(x, \infty) = \mathbb{P}_n \Delta_+ 1\{U \le x\}, & x \ge 0,
\end{aligned} \tag{12}$$

where U is defined in (5) and $\mathbb{P}_n f(X) = n^{-1} \sum_{i=1}^n f(X_i)$. Furthermore, let

$$\Lambda_n(x, y) = \int_{[0,x]} \frac{\mathbb{V}_n(ds, y)}{1 - \mathbb{H}_n(s-)} \quad\text{and}\quad \Lambda_{Xn}(x) = \Lambda_n(x, \infty) = \int_{[0,x]} \frac{\mathbb{V}_{Xn}(ds)}{1 - \mathbb{H}_n(s-)}. \tag{13}$$

Since

$$\Lambda_n(dx, y) = \frac{\mathbb{P}_n \Delta_+ 1\{U = x, Z \le y\}}{\mathbb{P}_n 1\{U \ge x\}} \quad\text{and}\quad \Lambda_{Xn}(dx) = \frac{\mathbb{P}_n \Delta_+ 1\{U = x\}}{\mathbb{P}_n 1\{U \ge x\}},$$

we can write equations (10) and (11) in terms of $\Lambda_{Xn}$ and $\Lambda_n$:

$$1 - \hat F_{Xn}^l(x) = \prod_{s \le x} \{1 - \Lambda_{Xn}(ds)\}, \tag{14}$$

$$\hat F_n^l(x, y) = \int_{s \le x} \prod_{u < s} \{1 - \Lambda_{Xn}(du)\}\, \Lambda_n(ds, y). \tag{15}$$

Note that (14) is analogous to the Kaplan-Meier estimator for right censored data, and that
(15) is analogous to equation (3.3) of Huang & Louis (1998). However, our functions $\Lambda_{Xn}$
and $\Lambda_n$ are defined differently, since they are based on the variable U. This difference lies at
the root of the inconsistency problems of the MLE.

The limits of the processes $\mathbb{H}_n$, $\mathbb{V}_n$, $\mathbb{V}_{Xn}$, $\Lambda_n$, $\Lambda_{Xn}$, $\hat F_n^l$ and $\hat F_{Xn}^l$ are given in the Appendix
(Lemmas 1 - 3) and are denoted by $H$, $V$, $V_X$, $\Lambda_\infty$, $\Lambda_{X\infty}$, $F_\infty^l$ and $F_{X\infty}^l$, respectively.
Corollaries in the Appendix (through Corollary 5) provide various alternative ways to express $F_\infty^l$.

We are now ready to give necessary and sufficient conditions for consistency of $\hat F_{Xn}^l$ and
$\hat F_n^l$, after introducing the following notation:

$$H(x) = V_X(x) + \int_{[0,x]} \{1 - F_{0X}(s)\}\, dG_k(s), \tag{16}$$

$$V(dx, y) = \sum_{j=1}^{k} F_0(x, y)\, dG_j(x) - \sum_{j=2}^{k} \int_{s \in [0,x]} F_0(s, y)\, dG_{j-1,j}(s, x), \tag{17}$$

$$V_X(dx) = \sum_{j=1}^{k} F_{0X}(x)\, dG_j(x) - \sum_{j=2}^{k} \int_{s \in [0,x]} F_{0X}(s)\, dG_{j-1,j}(s, x), \tag{18}$$

see equations (22) - (24) in the Appendix. Moreover, throughout this section we let $\tau$ be
such that $H(\tau) < 1$, we define $0/0 = 0$ and $f(x-) = \lim_{t \uparrow x} f(t)$ for any function $f : \mathbb{R} \to \mathbb{R}$.

Theorem 1 The MLE is inconsistent in general. The MLE $\hat F_{Xn}^l$ is consistent for $F_{0X}$ on
$(0, \tau]$ if and only if the following condition holds for all $x \in (0, \tau]$:

$$\Lambda_{X\infty}(x) = \int_{[0,x]} \frac{V_X(ds)}{1 - H(s-)} = \int_{[0,x]} \frac{F_{0X}(ds)}{1 - F_{0X}(s-)} = \Lambda_{0X}(x). \tag{19}$$

The MLE $\hat F_n^l$ is consistent for $F_0$ on $(0, \tau] \times \mathbb{R}$ if and only if the following condition holds for
all $x \in (0, \tau]$, $y \in \mathbb{R}$:

$$\Lambda_\infty(x, y) = \int_{[0,x]} \frac{V(ds, y)}{1 - H(s-)} = \int_{[0,x]} \frac{F_0(ds, y)}{1 - F_{0X}(s-)} = \Lambda_0(x, y). \tag{20}$$

Finally, let $x_0 \in (0, \tau]$ be such that $F_{X\infty}^l(x_0) > 0$. Then $\hat F_n^l(x_0, y) / \hat F_{Xn}^l(x_0)$ is consistent for $F_{0Y}(y)$
if X and Y are independent.

Proof. The one-to-one correspondence between a univariate distribution function and its
cumulative hazard function implies that $\hat F_{Xn}^l$ is consistent for $F_{0X}$ if and only if $\Lambda_{X\infty}$ (equation
(26) in the Appendix) equals the cumulative hazard function $\Lambda_{0X}$ of $F_{0X}$. This gives
condition (19). Similarly, it follows that $\hat F_n^l(x, y)$ is consistent for $F_0(x, y)$ if and only if $\Lambda_\infty$
(equation (25) in the Appendix) equals the mark specific cumulative hazard function $\Lambda_0$ of
$F_0$. This gives condition (20). The final claim of the theorem follows from equation (32) in
the Appendix. □



Note that conditions (19) and (20) are difficult to interpret, since $F_{0X}$ and $F_0$ enter on
both sides of the equations when we plug in expressions (16) - (18) for $H(s-)$, $V(ds, y)$
and $V_X(ds)$. However, it is clear that the conditions force a relation between the unknown
distribution $F_0$ and the observation time distribution G. Such a relation will typically not hold
and cannot be assumed since $F_0$ is unknown. Hence, it follows that the MLE is inconsistent in
general. The following corollary further strengthens this result when X is subject to current
status censoring.

Corollary 1 Let X be subject to current status censoring, and let $F_{0X}$ and G be continuous.
Then the MLE $\hat F_{Xn}^l$ is inconsistent for any choice of $F_{0X}$ and G.

Proof. Let $\gamma = \inf\{x : F_{0X}(x) > 0\} < \tau$. Since X is subject to current status censoring and
since the distributions G and $F_{0X}$ are continuous, condition (19) can be rewritten as

$$\int_{(\gamma, x]} \frac{dG(s)}{1 - G(s)} = \int_{(\gamma, x]} \frac{dF_{0X}(s)}{F_{0X}(s)\{1 - F_{0X}(s)\}}, \quad x \in (\gamma, \tau].$$

This integral equation is solved by

$$-\log\{1 - G(x)\} + C = \log\Big\{ \frac{F_{0X}(x)}{1 - F_{0X}(x)} \Big\}, \quad x \in (\gamma, \tau].$$

This yields $F_{0X}(x) = [1 + \exp(-C)\{1 - G(x)\}]^{-1}$ for $x \in (\gamma, \tau]$. Since there is no finite C such
that $F_{0X}(\gamma) = 0$ holds, it follows that condition (19) fails for all continuous distributions G
and $F_{0X}$. □
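
To make the size of the inconsistency concrete, the following sketch evaluates the limit of the lower-bound marginal estimator under current status censoring. It rests on two assumptions that are not spelled out in this section: for $k = 1$ the limiting quantities reduce to $V_X(ds) = F_{0X}(s)\,dG(s)$ and $H = G$, and for continuous distributions the product integral of the limiting hazard reduces to $1 - \exp\{-\Lambda_{X\infty}(x)\}$. The function name `limit_lower_marginal`, the midpoint rule, and the choice of uniform distributions for both X and T are illustrative only.

```python
import math

def limit_lower_marginal(x, F0, g, G, steps=20000):
    """Approximate 1 - exp(-Lambda(x)) with Lambda(x) = int_0^x F0(s) g(s) / (1 - G(s)) ds.

    F0: distribution function of X; g and G: density and distribution function of T.
    The integral is approximated by the midpoint rule with `steps` subintervals.
    """
    h = x / steps
    total = 0.0
    for m in range(steps):
        s = (m + 0.5) * h
        total += F0(s) * g(s) / (1.0 - G(s)) * h
    return 1.0 - math.exp(-total)

# Hypothetical example: X and T both uniform on (0, 1), evaluated at x = 0.5.
uniform_cdf = lambda s: s
uniform_pdf = lambda s: 1.0
value = limit_lower_marginal(0.5, uniform_cdf, uniform_pdf, uniform_cdf)
# Closed form in this example: Lambda(x) = -x - log(1 - x), so the limit is 1 - (1 - x) e^x.
closed_form = 1.0 - 0.5 * math.exp(0.5)
```

Under these assumptions the limit at $x = 0.5$ is about 0.176, well below the true value $F_{0X}(0.5) = 0.5$, in line with the negative bias discussed in Remark 1.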

Finally, we show that the asymptotic bias of the MLE converges to zero as the number k
of observation times per subject increases, for at least one particular distribution of $T =
(T_1, \ldots, T_k)$, namely if $T_1, \ldots, T_k$ are distributed as the order statistics of a uniform sample
on $[0, \theta]$. The proof of this result is given in the Appendix.

Corollary 2 Let X be subject to interval censoring case k, and let the elements $T_1, \ldots, T_k$ of
T be the order statistics of k independent uniform random variables on $[0, \theta]$. Let $V^k(x, y)$,
$V_X^k(x)$, $H^k(x)$, $\Lambda_\infty^k(x, y)$ and $\Lambda_{X\infty}^k(x)$ denote the limits defined in the lemmas of the Appendix, using the
superscript k to denote the dependence on k. Then

$$\Lambda_{X\infty}^k(x) = \int_{[0,x]} \frac{V_X^k(ds)}{1 - H^k(s-)} \to \int_{[0,x]} \frac{F_{0X}(ds)}{1 - F_{0X}(s-)} = \Lambda_{0X}(x),$$

$$\Lambda_\infty^k(x, y) = \int_{[0,x]} \frac{V^k(ds, y)}{1 - H^k(s-)} \to \int_{[0,x]} \frac{F_0(ds, y)}{1 - F_{0X}(s-)} = \Lambda_0(x, y), \quad k \to \infty,$$

for all continuity points of $\Lambda_{0X}$ and $\Lambda_0$ with $x < \theta$ and $y \in \mathbb{R}$.



4. Repaired MLE via discretization of marks 



We now define a simple repaired estimator $\tilde F_n(x, y)$ which is consistent for $F_0(x, y)$ for y on a
grid. The idea behind the estimator is that one can define discrete competing risks based on
a continuous random variable. Doing so transforms interval censored continuous mark data
into interval censored data with competing risks.

To describe the method, we let $K > 0$ and define a grid $-\infty = y_0 < y_1 < \cdots < y_K <
y_{K+1} = \infty$. Next, we introduce a new random variable $C \in \{1, \ldots, K+1\}$:

$$C = \sum_{j=1}^{K+1} j\, 1\{y_{j-1} < Y \le y_j\}.$$

We can determine the value of C for all observations with an observed mark. Hence, we can
transform the observations $(T, \Delta, Z)$ into $(T, \Delta, Z^*)$, where $Z^* = \Delta_+ C$. This gives interval
censored data with $K+1$ competing risks.
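
The discretization step is a simple lookup of the grid cell that contains the mark. A minimal sketch, assuming the finite grid points $y_1 < \cdots < y_K$ are supplied as a sorted Python list (the function names `discretize_mark` and `to_competing_risks` are illustrative):

```python
from bisect import bisect_left

def discretize_mark(y, grid):
    """Return C in {1, ..., K + 1} such that y_{C-1} < y <= y_C.

    grid: sorted list [y_1, ..., y_K]; y_0 = -inf and y_{K+1} = +inf are implicit.
    """
    # bisect_left counts the grid points strictly below y, so a mark equal to
    # y_j is assigned to cell j, matching the half-open cells (y_{j-1}, y_j].
    return bisect_left(grid, y) + 1

def to_competing_risks(delta, z, grid):
    """Map (Delta, Z) to Z* = Delta_+ * C; delta has k + 1 entries."""
    delta_plus = sum(delta[:-1])  # 1 iff the mark was observed
    return discretize_mark(z, grid) if delta_plus else 0
```

With the grid 0, 0.5, 1, a mark of 0.2 is mapped to cell 2 and a mark of 0.7 to cell 3, while an observation whose mark is missing receives the value 0.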

Since the observed sets for interval censored data with competing risks form a partition
of the space $\mathbb{R}_+ \times \{1, \ldots, K+1\}$, Hellinger consistency of the MLE follows from Theorems 9
and 10 of Van der Vaart & Wellner (2000). Under some additional regularity conditions, we
can derive local and uniform consistency from the Hellinger consistency, see Maathuis (2006,
Section 4.2). This means that we can consistently estimate the sub-distribution functions
$F_{0j}(x) = P(X \le x, C = j) = P(X \le x, y_{j-1} < Y \le y_j)$, $x \in \mathbb{R}_+$. Hence, we can consistently
estimate $F_0(x, y_j) = \sum_{l=1}^{j} F_{0l}(x)$ for $x \in \mathbb{R}_+$ and $y_j$ on the grid.

Note that the introduction of the variable C causes more overlap between observed sets,
since previously non-overlapping horizontal line segments may overlap if they are assigned
the same value of C. As a result, the repaired MLE has smaller maximal intersections in the
x-direction. Hence, the repaired MLE is affected less by representational non-uniqueness on
the x-axis. This is visible in Examples 3 and 4 in Section 5.



The repaired MLE can be computed with one of the algorithms described in Groeneboom,
Maathuis & Wellner (2006a, Section 2.4). It may be tempting to choose K large, such that
$F_0(x, y)$ can be estimated for y on a fine grid. However, this may result in a poor estimator.
To obtain a good estimator one should choose the grid such that there are ample observations
for each value of C. In practice, one can start with a coarse grid, and then refine the grid
as long as the estimator stays close to the one computed on the coarse grid.

In principle it is possible to estimate the entire joint distribution function $F_0(x, y)$ for $(x, y)$
in the interior of the support of the distribution of the observation times under smoothness
assumptions on $F_0$. This would proceed by letting both $K$ and the $y_j$'s defining the partition
all depend on $n$ in such a way that $K = K_n \to \infty$,

$$\max_{1 \le j \le K_n - 1} (y_{j+1,n} - y_{j,n}) \to 0, \quad \text{and} \quad n \min_{1 \le j \le K_n - 1} (y_{j+1,n} - y_{j,n}) \to \infty, \quad \text{as } n \to \infty.$$

It would even be possible to choose $K_n$ and $\{y_{j,n}\}$ depending on the data
via model-selection methods (see, e.g., Birgé & Massart (1997) and Barron, Birgé & Massart
(1999)), but these further developments are beyond the scope of the present paper and will
be investigated in detail elsewhere.
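
As an illustration only, consider an equidistant partition of a bounded interval $[a, b]$ with $K_n = \lceil n^{\alpha} \rceil$ cells. The interval $[a, b]$ and the exponent $\alpha$ are hypothetical choices, and the second rate condition is read here as $n \min_j (y_{j+1,n} - y_{j,n}) \to \infty$. Under these assumptions both conditions hold for any fixed $0 < \alpha < 1$:

```latex
y_{j,n} = a + j\,\frac{b-a}{K_n}, \quad j = 1, \dots, K_n,
\qquad K_n = \lceil n^{\alpha} \rceil, \quad 0 < \alpha < 1 .

% The spacing shrinks to zero:
\max_{j}\,(y_{j+1,n} - y_{j,n}) = \frac{b-a}{K_n} \le (b-a)\, n^{-\alpha} \to 0 .

% But slowly enough that the number of observations per cell can grow:
n \min_{j}\,(y_{j+1,n} - y_{j,n}) = \frac{n\,(b-a)}{K_n}
  \ge \frac{n\,(b-a)}{n^{\alpha} + 1} \to \infty .
```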



Maathuis (2006), Groeneboom, Maathuis & Wellner (2006a) and Groeneboom, Maathuis & Wellner
(2006b) showed that the MLE for current status data with competing risks converges at
rate $n^{1/3}$ to a new self-induced limiting distribution. This result implies that one can use
subsampling to construct pointwise confidence intervals for the sub-distribution functions
(Politis, Romano & Wolf (1999)). This method is also valid for the repaired MLE for current
status data with continuous marks, and can be used for the construction of pointwise
confidence intervals for $F_0(x, y)$ for $y$ on the grid. The limiting distribution of the MLE for
more general forms of interval censoring with competing risks has not yet been established,
and in such cases the use of subsampling is therefore not yet justified.
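
A minimal sketch of such a subsampling interval is given below. To keep it self-contained, it uses the MLE for plain current status data (isotonic regression of the indicators on the sorted observation times) as a stand-in for the repaired MLE; this substitution, the function names, and the quantile rule are assumptions made for illustration. Only the rate $n^{1/3}$ is taken from the text.

```python
import math
import random


def pava(values):
    """Non-decreasing least-squares fit (pool adjacent violators), unit weights."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block mean exceeds the last block mean.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def current_status_mle_at(sample, x0):
    """MLE of F(x0) from pairs (t, delta) with delta = 1{X <= t}."""
    ordered = sorted(sample)
    fit = pava([d for _, d in ordered])
    value = 0.0
    for (t, _), f in zip(ordered, fit):
        if t > x0:
            break
        value = f
    return value


def subsampling_ci(sample, estimator, rate, b, n_sub, alpha, rng):
    """Subsampling interval based on rate(b) * (theta_b - theta_n)."""
    n = len(sample)
    theta_n = estimator(sample)
    roots = sorted(
        rate(b) * (estimator(rng.sample(sample, b)) - theta_n)
        for _ in range(n_sub)
    )
    lo_q = roots[math.floor(alpha / 2 * (n_sub - 1))]
    hi_q = roots[math.ceil((1 - alpha / 2) * (n_sub - 1))]
    return theta_n - hi_q / rate(n), theta_n - lo_q / rate(n)
```

A call would pass `rate=lambda m: m ** (1 / 3)` and a subsample size `b` that is small relative to `n`; how to choose `b` in practice is not addressed here.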



Jewell, Van der Laan & Henneman (2003) and Maathuis (2006, Chapter 7) studied estimation
of a family of smooth functionals of the sub-distribution functions for current status
data with competing risks. Jewell et al. (2003) suggested that their "naive estimator" yields
asymptotically efficient estimators for these smooth functionals, and Maathuis (2006) showed
that the same is true for the MLE. These results extend to the repaired MLE for current
status data with continuous marks. Asymptotic properties of estimators of smooth functionals
for more general forms of interval censoring with competing risks are currently still unknown.



5. Examples 

In this section we support the theoretical results of Sections 3 and 4 by simulations. In
particular, we show support for our claims that $\hat F_n^l \to_{a.s.} F_\infty^l$,
$\hat F_n^u \to_{a.s.} F_\infty^u$ and $\tilde F_n \to_{a.s.} F_0$.
Moreover, we show that the difference between the true underlying distribution $F_0$ and the
limits $F_\infty^l$ and $F_\infty^u$ of the MLE can be considerable. We give four examples that cover a wide
range of scenarios. They include cases where X and Y are independent (Ex. 1) or dependent
(Ex. 2-4), where X is subject to interval censoring case 1 (Ex. 1, 2) or case 2 (Ex. 3, 4),
and where the distribution of T is continuous (Ex. 1-3) or discrete (Ex. 4).

Example 1. Let X and Y be independent, with X ~ Unif(0, 1) and Y ~ Exp(1). Let X be
subject to current status censoring with observation time T ~ Unif(0, 0.5) independent of
(X, Y).

Example 2. Let X ~ Unif(0, 1), and let Y|X be exponentially distributed with mean 2/(2X + 1).
Let X be subject to current status censoring with observation time T ~ Unif(0, 1)
independent of (X, Y).

Example 3. Let X ~ Unif(0, 2), and let Y = X. Let X be subject to interval censoring
case 2 with observation times $(T_1, T_2)$, independent of (X, Y) and uniformly distributed over
$\{(t_1, t_2) : 0 \le t_1 \le 1, 1 \le t_2 \le 2\}$.

Example 4. Let (X, Y) be uniformly distributed over $\{(x, y) : 0 \le x \le y \le 1\}$. Let X be
subject to interval censoring case 2 with observation times $(T_1, T_2)$ independent of (X, Y).
Let the distribution of $(T_1, T_2)$ be discrete: G{(0.25, 0.5)} = 0.3, G{(0.25, 0.75)} = 0.3 and
G{(0.5, 0.75)} = 0.4.
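
For concreteness, data from the two current status designs (Examples 1 and 2) could be generated along the following lines. This is a sketch, not the code used for the simulations reported here; the function name and the representation of a missing mark by `None` are assumptions.

```python
import random


def simulate_current_status(n, example, rng):
    """Return n triples (T, Delta, Z) for Example 1 or 2.

    Delta = 1{X <= T}; Z equals the mark Y when Delta = 1 and is None
    (missing) otherwise.
    """
    rows = []
    for _ in range(n):
        x = rng.random()  # X ~ Unif(0, 1)
        if example == 1:
            y = rng.expovariate(1.0)       # Y ~ Exp(1), independent of X
            t = rng.uniform(0.0, 0.5)      # T ~ Unif(0, 0.5)
        elif example == 2:
            # expovariate takes the rate, i.e. 1 / mean = (2x + 1) / 2.
            y = rng.expovariate((2.0 * x + 1.0) / 2.0)
            t = rng.uniform(0.0, 1.0)      # T ~ Unif(0, 1)
        else:
            raise ValueError("only Examples 1 and 2 are covered")
        delta = 1 if x <= t else 0
        rows.append((t, delta, y if delta else None))
    return rows
```

For instance, `simulate_current_status(10000, 1, random.Random(0))` produces a sample of the size used in this section.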

For each example we derived the limits $F_\infty^l$ and $F_\infty^u$ of the MLE, using Lemma 3. Details
of these derivations are given in Maathuis (2006, Section 9.4). We also computed the MLEs
$\hat F_n^l$ and $\hat F_n^u$ and the repaired MLEs $\tilde F_n^l$ and $\tilde F_n^u$ for a simulated data set of size n = 10,000.
For the repaired MLE we used an equidistant grid with K = 20 points as shown in Figure 4.

The results are given in Figures 2-4. These figures show that the MLEs $\hat F_n^l$ and $\hat F_n^u$ are
indeed very close to our derived limits $F_\infty^l$ and $F_\infty^u$. On the other hand, the repaired MLEs
$\tilde F_n^l$ and $\tilde F_n^u$ are very close to the true underlying distribution $F_0$. Moreover, the results show
that there can be a substantial difference between the limit of the MLE and the true
underlying distribution $F_0$.

We now discuss the simulation results in more detail. Figure 2 considers estimation of
the joint distribution $F_0$. It shows the contour lines of the MLE $\hat F_n^l$, its limit $F_\infty^l$, and the
true underlying distribution $F_0$. Note that $\hat F_n^l$ and $F_\infty^l$ are almost indistinguishable, while
there is a clear difference between $F_\infty^l$ and $F_0$. The results for the upper limits $\hat F_n^u$ and $F_\infty^u$
are similar and not shown. Results for the repaired MLE are not shown since this estimator
only takes values for y on a grid.

Figure 3 considers estimation of the marginal distribution $F_{0X}$. We see that the MLEs
$\hat F_{Xn}^l$ and $\hat F_{Xn}^u$ are close to the derived limits $F_{X\infty}^l$ and $F_{X\infty}^u$. Moreover, note that $\hat F_{Xn}^l$ tends
to be below $F_{0X}$. This can be understood from the earlier remark explaining that $\hat F_n^l$
can be viewed as a right endpoint estimator, and hence tends to have a negative bias. Note
that the repaired MLE $\tilde F_{Xn}$ closely follows $F_{0X}$.

Figure 4 considers estimation of $F_0(x_0, y)$ for fixed $x_0$. The function $F_0(x_0, y)$ is often
estimated as an alternative for $F_{0Y}$, since $F_{0Y}$ is heavily affected by representational
non-uniqueness if the support of $T_1, \dots, T_k$ is strictly contained in the support of X, a situation
that often occurs in practice. The values of $x_0$ were chosen to show a range of scenarios for
the behavior of the MLE, and we see that $\hat F_n(x_0, y)$ can be much too large, much too small
and non-unique. The repaired MLE $\tilde F_n$ is again close to the underlying distribution.

Note that our examples are not linked to any specific application. For readers who are 
interested in a comparison between the MLE and the repaired MLE in a practical situation, 
we refer to HMG. They provide such a comparison for the HIV/AIDS vaccine trial data
VAX004 (Flynn et al. (2005)), as well as for simulated data that mimic the vaccine data.
They show a difference between the MLE and the repaired MLE in this setting, but the size
of the difference is quite small. This can be explained by Corollary 2, since the time between
successive follow-up visits is relatively short (about 6 months) and the infection rate is low.
Much larger differences can be expected in, for example, cross-sectional HIV studies, where
there is only one observation time per person.
[Figures 2-4 about here.]

6. Discussion 

We studied the MLE of the bivariate distribution of an interval censored survival time and a 
continuous mark variable. We derived the almost sure limit of the MLE, and showed that the 
MLE is inconsistent in general. We proposed a simple method to repair the inconsistency, 
and illustrated the behavior of the inconsistent and repaired MLE in four examples. 

We were prompted to investigate consistency of the MLE in the interval censored contin- 
uous mark model, since the observed sets in this model can take the form of line segments. 
Such line segments are an indicator of consistency problems for the MLE, since the MLE for
bivariate censored data has been found to be inconsistent before when such line segments
were present (Van der Laan and Maathuis (2003, Section 6.2)). In this sense our results
do not come as a surprise, and they confirm the idea that the presence of line segments
is indicative of consistency problems of the MLE.

There are, however, interesting differences in the underlying reasons for inconsistency in 
the above mentioned models. The inconsistency of the MLE in the model considered by
Maathuis (2003) could be explained by representational non-uniqueness of the MLE. This is
not the case for the interval censored continuous mark model, where the MLE is typically
inconsistent even if its limit is fully unique. Rather, the inconsistency in the interval censored
continuous mark model can be explained by the fact that the cumulative hazard functions
that define the MLE in (10) and (11) do not converge to the true underlying cumulative
hazard functions.

Finally, we provide a more detailed discussion of the connections between the current
paper and the paper by HMG, since these papers have been heavily influenced by each other.
HMG started studying the interval censored continuous mark model, in order to analyze data
from the first Phase III HIV/AIDS vaccine trial VAX004 (Flynn et al. (2005)). We suspected
inconsistency of the MLE in this model, and investigated this issue more closely. This study
has resulted in the current paper. In turn, our paper has influenced the work of HMG and
their analysis of the VAX004 data.

There are also some differences between the models in the two papers. HMG considered 
a slightly more complicated interval censored continuous mark model, assuming that X is 
mixed case interval censored (as discussed in Section 1) instead of case k interval censored. 
They showed that our results in Sections 3 and 4 can be generalized to that situation. Thus,
the MLE is typically inconsistent in this model as well, and this inconsistency can be repaired 
by discretizing the marks. HMG also considered a complication regarding the mark variable 
Y. In addition to assuming that Y is missing for all individuals who did not experience the 
failure event, they allowed Y to be missing with some probability $p \in (0, 1)$ for individuals
who did experience the failure event. In this case there is no closed form available for the 
MLE. It is therefore more difficult to study consistency issues, and consistency of the MLE in 
this model is currently still an open problem. However, due to the presence of line segments 
we expect inconsistency, and this conjecture is supported by simulation results of HMG. 
HMG therefore included our repaired MLE in the analysis of the VAX004 data.

8. Appendix

This section contains several technical lemmas and proofs that are needed for the results in
Section 3. Lemma 1 gives the almost sure limits $H$, $V$ and $V_X$ of the processes $\mathbb{H}_n$, $\mathbb{V}_n$, $\mathbb{V}_{Xn}$
that were defined in (12). Lemma 2 provides the almost sure limits $\Lambda_\infty$ and $\Lambda_{X\infty}$ of the
processes $\Lambda_n$ and $\Lambda_{Xn}$ that were defined in (13). Lemma 3 gives the almost sure limits $F_\infty^l$
and $F_{X\infty}^l$ of the MLEs $\hat F_n^l$ and $\hat F_{Xn}^l$ that were given in (10) and (11). Corollary 3 provides an
alternative way to express $F_\infty^l$. Corollaries 4 and 5 specialize this result to two special cases,
namely the case that X and Y are independent, and the case that X is subject to current
status censoring. Finally, we provide a proof of Corollary 2.

Lemma 1 For $I \subset \mathbb{R}^d$ with $d \ge 1$, let $D(I)$ be the space of cadlag functions on $I$.
Furthermore, let $\|\cdot\|_\infty$ be the supremum norm on $(D(\mathbb{R}_+), D(\mathbb{R}_+), D(\mathbb{R}_+ \times \mathbb{R}))$. Then

$$\|(\mathbb{H}_n - H, \mathbb{V}_{Xn} - V_X, \mathbb{V}_n - V)\|_\infty \to_{a.s.} 0, \qquad (21)$$

where

$$V(x,y) = \sum_{j=1}^{k} \int_{[0,x]} F_0(t,y)\,dG_j(t) - \sum_{j=2}^{k} \int_{0 \le s \le t \le x} F_0(s,y)\,dG_{j-1,j}(s,t), \qquad (22)$$

$$V_X(x) = \sum_{j=1}^{k} \int_{[0,x]} F_{0X}(t)\,dG_j(t) - \sum_{j=2}^{k} \int_{0 \le s \le t \le x} F_{0X}(s)\,dG_{j-1,j}(s,t), \qquad (23)$$

$$H(x) = V_X(x) + \int_{[0,x]} \{1 - F_{0X}(s)\}\,dG_k(s), \qquad (24)$$

and $G_{j-1,j}$ and $G_k$ are defined in the beginning of Section 2.



Proof. Equation (21) follows immediately from the Glivenko-Cantelli theorem, with
$H(x) = E(1\{U \le x\})$, $V(x,y) = E(\Delta_+ 1\{U \le x, Z \le y\})$ and
$V_X(x) = V(x,\infty) = E(\Delta_+ 1\{U \le x\})$.
We now express $H$, $V$ and $V_X$ in terms of $F_0$ and $G$. Note that the events $[\Delta_j = 1]$,
$j = 1, \dots, k+1$, are disjoint. Furthermore, note that $U = T_j$ and $Z = Y$ on $[\Delta_j = 1]$,
$j = 1, \dots, k$, and $U = T_k$ on $[\Delta_{k+1} = 1]$. Hence,

$$V(x,y) = E(\Delta_+ 1\{U \le x, Z \le y\}) = \sum_{j=1}^{k} P(\Delta_j = 1, Y \le y, T_j \le x)$$

$$= \sum_{j=1}^{k} P(X \in (T_{j-1}, T_j], Y \le y, T_j \le x)$$

$$= \sum_{j=1}^{k} \int_{0 \le s \le t \le x} \{F_0(t,y) - F_0(s,y)\}\,dG_{j-1,j}(s,t).$$

Using $T_0 = 0$, $X > 0$ and $G(\{0 < T_1 < \cdots < T_k\}) = 1$, this can be written as

$$\sum_{j=1}^{k} \int_{[0,x]} F_0(t,y)\,dG_j(t) - \sum_{j=2}^{k} \int_{0 \le s \le t \le x} F_0(s,y)\,dG_{j-1,j}(s,t).$$

Taking $y = \infty$ yields the expression for $V_X(x)$. The expression for $H$ follows similarly, using

$$H(x) = E\,1\{U \le x\} = \sum_{j=1}^{k} P(\Delta_j = 1, T_j \le x) + P(\Delta_{k+1} = 1, T_k \le x).$$

□
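
The limits in Lemma 1 can be checked numerically in a simple case. The sketch below uses the current status design of Example 1 ($k = 1$), for which (22) and (24) reduce to $V(x,y) = (1 - e^{-y})x^2$ and $H(x) = 2x$ for $x \le 0.5$. These closed forms follow from $F_0(t,y) = t(1 - e^{-y})$ and $dG(t) = 2\,dt$ on $[0, 0.5]$, and the function names are hypothetical.

```python
import math
import random


def empirical_V_H(n, x, y, seed=0):
    """Monte Carlo versions of V(x, y) and H(x) under Example 1."""
    rng = random.Random(seed)
    v_hits = h_hits = 0
    for _ in range(n):
        surv = rng.random()               # X ~ Unif(0, 1)
        mark = rng.expovariate(1.0)       # Y ~ Exp(1), independent of X
        obs_time = rng.uniform(0.0, 0.5)  # T ~ Unif(0, 0.5)
        delta = surv <= obs_time          # failure observed by time T
        if obs_time <= x:                 # U = T for current status data
            h_hits += 1
            if delta and mark <= y:
                v_hits += 1
    return v_hits / n, h_hits / n


def limit_V_H(x, y):
    """Closed forms of (22) and (24) for Example 1, valid for x <= 0.5."""
    return (1.0 - math.exp(-y)) * x * x, 2.0 * x
```

Comparing `empirical_V_H(50000, 0.4, 1.0)` with `limit_V_H(0.4, 1.0)` gives a quick sanity check of the Glivenko-Cantelli argument at a single point.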

Lemma 2 Let $\|\cdot\|_\infty$ be the supremum norm on $(D[0,\tau], D([0,\tau] \times \mathbb{R}))$. Then

$$\|(\Lambda_{Xn} - \Lambda_{X\infty}, \Lambda_n - \Lambda_\infty)\|_\infty \to_{a.s.} 0,$$

where

$$\Lambda_\infty(x,y) = \int_{[0,x]} \frac{V(ds,y)}{1 - H(s-)}, \quad x \in [0,\tau],\ y \in \mathbb{R}, \qquad (25)$$

$$\Lambda_{X\infty}(x) = \Lambda_\infty(x,\infty) = \int_{[0,x]} \frac{V_X(ds)}{1 - H(s-)}, \quad x \in [0,\tau]. \qquad (26)$$

Proof. This proof is similar to the discussion on page 1536 of Gill & Johansen (1990). For
all $x \ge 0$, let $\mathbb{H}_n^-(x) = \mathbb{H}_n(x-)$. Consider the mappings

$$(\mathbb{H}_n^-, \mathbb{V}_{Xn}, \mathbb{V}_n) \to (\{1 - \mathbb{H}_n^-\}^{-1}, \mathbb{V}_{Xn}, \mathbb{V}_n) \to (\Lambda_{Xn}, \Lambda_n)$$

on the spaces

$$(D_-[0,\tau], D[0,\tau], D([0,\tau] \times \mathbb{R})) \to (D_-[0,\tau], D[0,\tau], D([0,\tau] \times \mathbb{R})) \to (D[0,\tau], D([0,\tau] \times \mathbb{R})),$$

where $D_-[0,\tau]$ is the space of 'caglad' (left-continuous with right limits) functions on $[0,\tau]$.
The first mapping is continuous with respect to the supremum norm when we restrict the
domain of its first argument to elements of $D_-[0,\tau]$ that are bounded by say $\{1 + H(\tau)\}/2 < 1$.
Strong consistency of $\mathbb{H}_n^-$ ensures that it satisfies this bound with probability one for n large
enough. The second mapping is continuous with respect to the supremum norm by the
Helly-Bray lemma. Combining the continuity of these mappings with Lemma 1 yields the result of
the lemma. □

Lemma 3 Let $\|\cdot\|_\infty$ be the supremum norm on $(D[0,\tau], D([0,\tau] \times \mathbb{R}))$. Then

$$\|(\hat F_{Xn}^l - F_{X\infty}^l, \hat F_n^l - F_\infty^l)\|_\infty \to_{a.s.} 0,$$

where

$$F_{X\infty}^l(x) = 1 - \prod_{s \le x} \{1 - \Lambda_{X\infty}(ds)\}, \qquad (27)$$

$$F_\infty^l(x,y) = \int_{u \le x} \prod_{s < u} \{1 - \Lambda_{X\infty}(ds)\}\,\Lambda_\infty(du,y). \qquad (28)$$

Proof. To derive the almost sure limit of $\hat F_{Xn}^l$, consider the mapping

$$\Lambda_{Xn} \to \prod_{s \le x} \{1 - \Lambda_{Xn}(ds)\} = 1 - \hat F_{Xn}^l(x) \qquad (29)$$

on the space $D[0,\tau]$ to itself. This mapping is continuous with respect to the supremum norm
when its domain is restricted to functions of uniformly bounded variation (Gill & Johansen
(1990), Theorem 7). Note that, for $s \in [0,\tau]$, $\Lambda_{Xn}(s) \le 1/\{1 - \mathbb{H}_n(\tau)\} < 2/\{1 - H(\tau)\}$ with
probability one for n large enough. Together with the monotonicity of $\Lambda_{Xn}$ this implies that
with probability one $\Lambda_{Xn}$ is of uniformly bounded variation on $[0,\tau]$, for n large enough. The
almost sure limit of $\hat F_{Xn}^l$ now follows by combining Lemma 2 and the continuity of (29).
To derive the almost sure limit of $\hat F_n^l$, consider the mapping

$$(\Lambda_{Xn}, \Lambda_n) \to \int_{u \le x} \prod_{s < u} \{1 - \Lambda_{Xn}(ds)\}\,\Lambda_n(du,y) = \hat F_n^l(x,y)$$

on the space $(D[0,\tau], D([0,\tau] \times \mathbb{R}))$ to $D([0,\tau] \times \mathbb{R})$. This mapping is continuous with respect
to the supremum norm when its domain is restricted to functions of uniformly bounded
variation (Huang & Louis (1998), Theorem 1). Note that $\Lambda_n(x,y) \le \Lambda_{Xn}(x)$, so that with
probability one the pair $(\Lambda_n, \Lambda_{Xn})$ is uniformly bounded for n large enough. The result then
follows as in the first part of the proof. □
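
For step functions, the product integral in (27) and the integral in (28) reduce to finite products and sums over the jump times. The following sketch makes this explicit for one fixed value of $y$. The function name and the input format (lists of jumps at common, ordered jump times) are assumptions made for illustration.

```python
def product_limit(d_lambda_x, d_lambda_y):
    """Evaluate (27) and (28) for pure-jump cumulative hazards.

    d_lambda_x[i]: jump of Lambda_X at the i-th ordered jump time.
    d_lambda_y[i]: jump of Lambda(., y) at the same time, for one fixed y.
    Returns the values of F_X and F(., y) at each jump time.
    """
    surv_before = 1.0  # running value of prod_{s < u} {1 - Lambda_X(ds)}
    f_joint = 0.0
    f_x_path, f_joint_path = [], []
    for dx, dy in zip(d_lambda_x, d_lambda_y):
        f_joint += surv_before * dy   # mass contributed at this jump, as in (28)
        surv_before *= 1.0 - dx       # extend the product to s <= u, as in (27)
        f_x_path.append(1.0 - surv_before)
        f_joint_path.append(f_joint)
    return f_x_path, f_joint_path
```

Passing the same jumps for both arguments corresponds to $y = \infty$, in which case the two returned paths coincide, in line with $F_{X\infty}^l(x) = F_\infty^l(x, \infty)$.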



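Corollary 3 For $x \in [0,\tau]$, $y \in \mathbb{R}$, we can write

$$F_\infty^l(x,y) = \int_{[0,x]} \frac{\Lambda_\infty(ds,y)}{\Lambda_{X\infty}(ds)}\,dF_{X\infty}^l(s) = \int_{[0,x]} \frac{V(ds,y)}{V_X(ds)}\,dF_{X\infty}^l(s). \qquad (30)$$

Proof. Combining equations (27) and (28) yields

$$F_\infty^l(x,y) = \int_{[0,x]} \{1 - F_{X\infty}^l(s-)\}\,\Lambda_\infty(ds,y). \qquad (31)$$

Taking $y = \infty$ gives $F_{X\infty}^l(x) = F_\infty^l(x,\infty) = \int_{[0,x]} \{1 - F_{X\infty}^l(s-)\}\,\Lambda_{X\infty}(ds)$, so that
$dF_{X\infty}^l(s) = \{1 - F_{X\infty}^l(s-)\}\,\Lambda_{X\infty}(ds)$. Combining this with equation (31) yields the first equality of (30).
The second equality follows from the identities

$$\Lambda_\infty(ds,y) = V(ds,y)/\{1 - H(s-)\},$$
$$\Lambda_{X\infty}(ds) = V_X(ds)/\{1 - H(s-)\}.$$

□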



Corollary 4 Let X and Y be independent. Then

$$F_\infty^l(x,y) = F_{X\infty}^l(x)\,F_{0Y}(y), \quad x \in [0,\tau],\ y \in \mathbb{R}. \qquad (32)$$

Proof. If X and Y are independent, equations (17) and (18) yield $V(ds,y) = F_{0Y}(y)\,V_X(ds)$.
Substituting this into equation (30) gives the result. □

Corollary 5 Let X be subject to current status censoring (k = 1). Then

$$F_\infty^l(x,y) = \int_{[0,x]} P(Y \le y \mid X \le s)\,dF_{X\infty}^l(s), \quad x \in [0,\tau],\ y \in \mathbb{R}.$$

Proof. For k = 1 equations (17) and (18) reduce to $V(ds,y) = F_0(s,y)\,dG(s)$ and
$V_X(ds) = F_{0X}(s)\,dG(s)$. Hence, $V(ds,y)/V_X(ds) = F_0(s,y)/F_{0X}(s) = P(Y \le y \mid X \le s)$. Substituting
this into equation (30) completes the proof. □
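
As a worked illustration, consider the design of Example 1 (k = 1, X ~ Unif(0, 1), T ~ Unif(0, 0.5), X and Y independent). The computation below is a sketch based on (24), (26), (27) and (32). It uses that $\Lambda_{X\infty}$ is continuous in this case, so that the product integral reduces to an exponential, and it is restricted to $0 \le x < 0.5$.

```latex
V_X(ds) = F_{0X}(s)\,dG(s) = 2s\,ds, \qquad H(s) = G(s) = 2s, \qquad 0 \le s < 0.5,

\Lambda_{X\infty}(x) = \int_0^x \frac{2s}{1 - 2s}\,ds = -x - \tfrac{1}{2}\log(1 - 2x),

F_{X\infty}^l(x) = 1 - \exp\{-\Lambda_{X\infty}(x)\} = 1 - e^{x}\sqrt{1 - 2x},

F_\infty^l(x,y) = F_{X\infty}^l(x)\,F_{0Y}(y) = \{1 - e^{x}\sqrt{1 - 2x}\}\,(1 - e^{-y}).
```

At $x = 0.25$ this gives $F_{X\infty}^l(0.25) \approx 0.092$, compared with $F_{0X}(0.25) = 0.25$, so under these assumptions the limit lies well below the truth.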

Proof of Corollary 2. Since the observation times are the order statistics of $k$ i.i.d. uniform
random variables, the marginal densities $g_j$, $j = 1, \dots, k$, and the joint densities $g_{j-1,j}$,
$j = 2, \dots, k$, are known (see, e.g., Shorack & Wellner (1986), page 97). Summing them over $j$
yields:

$$\sum_{j=1}^{k} g_j(t) = \frac{k}{\theta}\,1_{[0,\theta]}(t) \sum_{j-1=0}^{k-1} \binom{k-1}{j-1} \left(\frac{t}{\theta}\right)^{j-1} \left(1 - \frac{t}{\theta}\right)^{k-j} = \frac{k}{\theta}\,1_{[0,\theta]}(t),$$

$$\sum_{j=2}^{k} g_{j-1,j}(s,t) = \frac{k(k-1)}{\theta^2}\,1[0 \le s \le t \le \theta] \left(1 - \frac{t-s}{\theta}\right)^{k-2}.$$

Let $x < \theta$. Plugging the above expressions for $g_j$ and $g_{j-1,j}$ into (22), and using Fubini's
theorem to rewrite the second term of (22), we get

$$V^k(x,y) = \frac{k}{\theta} \int_{[0,x]} F_0(t,y)\,dt - \int\!\!\int_{0 \le s \le t \le x} F_0(s,y)\,\frac{k(k-1)}{\theta^2} \left(1 - \frac{t-s}{\theta}\right)^{k-2} ds\,dt$$

$$= \frac{k}{\theta} \int_{[0,x]} F_0(s,y) \left(1 - \frac{x-s}{\theta}\right)^{k-1} ds = \int_{[0,x]} F_0(s,y)\,dQ_k(s),$$

where, for $s \le x$,

$$Q_k(s) = \int_0^s \frac{k}{\theta} \left(1 - \frac{x-r}{\theta}\right)^{k-1} dr = \left(1 - \frac{x-s}{\theta}\right)^{k} - \left(1 - \frac{x}{\theta}\right)^{k}.$$

Thus, as $k \to \infty$, $Q_k(s)$ converges weakly to the distribution function with mass 1 at $x$.
Plugging in $y = \infty$ in $V^k(x,y)$ yields $V_X^k(x) = \int_{[0,x]} F_{0X}(s)\,dQ_k(s)$. Furthermore, plugging in
the expressions for $V_X^k$ and $G_k$ in (24) gives

$$H^k(x) = \int_{[0,x]} F_{0X}(s)\,dQ_k(s) + \int_{[0,x]} \{1 - F_{0X}(s)\}\,\frac{k}{\theta} \left(\frac{s}{\theta}\right)^{k-1} ds.$$

Hence, for $x < \theta$ we have $V^k(x,y) \to F_0(x,y)$, $V_X^k(x) \to F_{0X}(x)$ and $1 - H^k(x) \to 1 - F_{0X}(x)$
as $k \to \infty$ for continuity points of the limits. The corollary then follows from the extended
Helly-Bray theorem. □
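
The concentration of $Q_k$ at $x$ is easy to inspect numerically. The sketch below evaluates the closed form of $Q_k(s)$ from the proof, under the assumption that the observation times are order statistics of $k$ i.i.d. Unif(0, $\theta$) variables; the function name is hypothetical.

```python
def q_k(k, s, x, theta):
    """Closed form of Q_k(s) for 0 <= s <= x < theta."""
    return (1.0 - (x - s) / theta) ** k - (1.0 - x / theta) ** k


if __name__ == "__main__":
    # For growing k, Q_k(s) drops to 0 for s < x while Q_k(x) approaches 1.
    for k in (1, 10, 100, 1000):
        print(k, q_k(k, 0.4, 0.5, 1.0), q_k(k, 0.5, 0.5, 1.0))
```

For $k = 1$ the formula reduces to $Q_1(s) = s/\theta$, which matches the uniform distribution of a single observation time.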
[Figure 1: two panels over x = 1, ..., 15. Upper panel: observed sets. Lower panel: height map h(x), taking values 1, 2 and 3.]

Figure 1: Observed sets (upper panel) and the corresponding height map (lower panel) for
univariate right censored data, based on the following 7 observations of $(U, \Delta)$: (1, 1), (2.5, 0),
(5.5, 1), (8, 0), (9, 1), (10.5, 1) and (12, 0). Note that the maximal intersections are given by
the local maximum regions of the height map: {1}, {5.5}, {9}, {10.5} and $(12, \infty)$.
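
The height map in Figure 1 can be reproduced with a few lines of code. The sketch below assumes that an uncensored observation $(u, 1)$ has observed set $\{u\}$ and a right censored observation $(u, 0)$ has observed set $(u, \infty)$; the function name is hypothetical.

```python
def height(x, observations):
    """Number of observed sets containing x.

    observations: pairs (u, delta); the observed set is {u} when
    delta == 1 and the open half-line (u, infinity) when delta == 0.
    """
    return sum((x == u) if d == 1 else (x > u) for u, d in observations)


FIGURE_1_DATA = [(1, 1), (2.5, 0), (5.5, 1), (8, 0), (9, 1), (10.5, 1), (12, 0)]

if __name__ == "__main__":
    # Heights at the uncensored points and at one point beyond 12.
    for x in (1, 5.5, 9, 10.5, 13):
        print(x, height(x, FIGURE_1_DATA))
```

The points printed by the example are those belonging to the maximal intersections listed in the caption; evaluating `height` just next to them shows that each is a local maximum region.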



Figure 2: Contour lines of the bivariate functions $\hat F_n^l$ (left column), $F_\infty^l$ (middle column) and
$F_0$ (right column) for Examples 1-4. All functions were computed on an equidistant grid
with grid size 0.02, and sample size n = 10,000.

[Figure 3: four panels titled "F_X, Example 1", "F_X, Example 2", "F_X, Example 3" and "F_X, Example 4".]

Figure 3: Estimation of $F_{0X}$ in Examples 1-4. Dotted: the true underlying distribution
$F_{0X}$. Solid grey: the MLEs $\hat F_{Xn}^l$ and $\hat F_{Xn}^u$. Dashed: the limits $F_{X\infty}^l$ and $F_{X\infty}^u$ of the MLE.
Solid black: the repaired MLEs $\tilde F_{Xn}^l$ and $\tilde F_{Xn}^u$, using the equidistant grid with K = 20 shown
in Figure 4. In all cases n = 10,000.

[Figure 4: four panels titled "F(0.49, y), Example 1", "F(0.5, y), Example 2", "F(0.75, y), Example 3" and "F(0.25, y), Example 4", with y on the horizontal axis.]

Figure 4: Estimation of $F_0(x_0, y)$ in Examples 1-4, for fixed $x_0$ and $y \in \mathbb{R}$. Dotted: the true
underlying distribution $F_0(x_0, y)$. Solid grey: the MLEs $\hat F_n^l(x_0, y)$ and $\hat F_n^u(x_0, y)$. Dashed: the
limits $F_\infty^l(x_0, y)$ and $F_\infty^u(x_0, y)$ of the MLE. Circles: the repaired MLE $\tilde F_n^l(x_0, y) = \tilde F_n^u(x_0, y)$,
using an equidistant grid with K = 20. In all cases n = 10,000.